329 research outputs found

A performance comparison of feature extraction methods for sentiment analysis

Author: Lai Po Hung
Rayner Alfred
Publication venue: Springer Verlag
Publication date: 01/01/2017

Sentiment analysis is the task of classifying documents according to their sentiment polarity. Before sentiment documents can be classified, plain-text documents need to be transformed into data the system can work with. This step is known as feature extraction. Feature extraction produces text representations enriched with information that improves classification results. The experiment in this work investigates the effects of applying different sets of extracted features and discusses the behavior of those features in sentiment analysis. The feature extraction methods include unigrams, bigrams, trigrams, Part-Of-Speech (POS) and SentiWordNet methods. The unigram, part-of-speech and SentiWordNet features are word-based features, whereas bigrams and trigrams are phrase-based features. The experimental results show that phrase-based features are more effective for sentiment analysis, as the accuracies they produce are much higher than those of word-based features. This might be because word-based features disregard the sentence structure and word sequence of the original text, thus distorting its meaning. Bigram and trigram features retain some of the sequence of the sentences, thus contributing to better representations of the text.
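The difference between word-based and phrase-based features described in this abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and raw counts, neither of which is confirmed by the abstract; the function names and the sample review are hypothetical.

```python
from collections import Counter

def extract_ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_features(text, n):
    """Lowercase, split on whitespace (an assumption), and count n-grams."""
    tokens = text.lower().split()
    return Counter(extract_ngrams(tokens, n))

# Hypothetical review used only for illustration.
review = "the plot was not good"
unigrams = ngram_features(review, 1)  # word-based: "not" and "good" are separate
bigrams = ngram_features(review, 2)   # phrase-based: ("not", "good") stays together
```

In the unigram representation the negation and the polarity word are counted independently, whereas the bigram `("not", "good")` keeps them together, which is one way to read the abstract's point about retaining sequence.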

UMS Institutional Repository

An optimized multi-layer ensemble framework for sentiment analysis

Author: Alfred Rayner
Lai Po Hung
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

Public opinion plays an important role in decision-making tasks in various fields. Sentiment analysis is a key task in summarizing opinions, as it classifies opinion documents into positive and negative sentiment groups. Machine learning based classification is efficient and versatile. The ensemble concept is used to improve classification accuracy by combining the decisions of multiple classifiers. In this work, a framework for sentiment analysis is designed to extend the ensemble concept to all subtasks of machine learning classification in order to achieve better analysis. There are 3 subtasks in machine learning based sentiment analysis: feature extraction, feature selection and classification. The ensemble concept is applied to all 3 subtasks by using different methods to perform each one and combining their results. Optimization is performed using a Genetic Algorithm to find the combination of methods that performs better. The proposed framework is tested on datasets from 4 different domains, and the sentiment analysis accuracy is shown to be very high. Future work includes testing the framework on other classification domains and with different optimization algorithms.
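The idea of combining the decisions of multiple classifiers can be sketched with plain majority voting. This is not the authors' framework: the three rule-based classifiers and the word lists are hypothetical stand-ins, and the Genetic Algorithm that selects the combination of methods is not shown.

```python
from collections import Counter

# Hypothetical toy lexicons, for illustration only.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}

def lexicon_classifier(text):
    """Toy unigram rule: compare counts of positive and negative words."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score >= 0 else "negative"

def negation_classifier(text):
    """Toy bigram rule: a polarity word preceded by 'not' is flipped."""
    tokens = text.lower().split()
    score = 0
    for prev, tok in zip([""] + tokens, tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        score += -polarity if prev == "not" else polarity
    return "positive" if score >= 0 else "negative"

def cue_classifier(text):
    """Toy rule: any negative word or negation cue marks the text negative."""
    tokens = set(text.lower().split())
    return "negative" if tokens & (NEGATIVE | {"not", "never"}) else "positive"

def majority_vote(predictions):
    """Return the label predicted by the most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(classifiers, text):
    """Run every base classifier and combine their decisions by voting."""
    return majority_vote([clf(text) for clf in classifiers])

classifiers = [lexicon_classifier, negation_classifier, cue_classifier]
result = ensemble_predict(classifiers, "not good")
```

For the input `"not good"`, the unigram rule alone answers "positive", but the other two rules outvote it. An odd number of base classifiers avoids ties in the two-class case.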

UMS Institutional Repository

Beyond Sentiment Analysis: A Review of Recent Trends in Text Based Sentiment Analysis and Emotion Detection

Author: Lai Po Hung
Suraya Alias
Publication venue: ResearchGate
Publication date: 01/01/2023

Sentiment analysis is probably one of the best-known areas in text mining. However, in recent years, as big data has risen in popularity, more areas of text classification are being explored. Perhaps the next task to catch on is emotion detection, the task of identifying emotions, because emotions are finer-grained information that can be extracted from opinions. Besides writer sentiment, writer emotion is therefore also valuable data. Emotion detection can be done using text, facial expressions, verbal communication and brain waves; however, the focus of this review is on text-based sentiment analysis and emotion detection. The internet has provided an avenue for the public to express their opinions easily. These expressions contain not only positive or negative sentiments but emotions as well. These emotions can help in social behaviour analysis and in decision and policy making for companies and the country. Emotion detection can further support other tasks such as opinion mining and early depression detection. This review provides a comprehensive analysis of the recent shift from text sentiment analysis to emotion detection and of the challenges in these tasks. We summarize some of the works from the last five years and look at the methods they used. We also look at the models of emotion classes that are generally referenced. Text-based emotion detection has shifted from early keyword-based comparisons to machine learning and deep learning algorithms, which provide more flexibility and better performance.

UMS Institutional Repository

Sentiment analysis based on probabilistic classifier techniques in various Indonesian review data

Author: Lai Po Hung
Mohd Shamrie Sainin
Nur Hayatin
Suraya Alias
Publication venue: Scientific Research Support Fund of Jordan
Publication date: 01/01/2022

Sentiment analysis is a field of data science that aims to achieve a broader, holistic view of users’ needs and expectations. Indonesian user opinions have the potential to become valuable information through sentiment-analysis tasks. One of the most widely used supervised-learning techniques in Indonesian sentiment analysis is the Naïve Bayes classifier. The classifier can be optimized and tuned in various models to increase the performance of the sentiment analysis model. This research aims to examine the performance of various Naïve Bayes models in sentiment analysis, especially when applied to small datasets, to handle overfitting problems. The four Naïve Bayes models used are Gaussian, Multinomial, Complement and Bernoulli. We also analyze the effect of various pre-processing techniques on the models’ performance. Moreover, we build the first fashion dataset from the Indonesian marketplace, which has a unique character compared to datasets from other domains. Finally, we also use various datasets in the experiment to test the Naïve Bayes models’ performance. The experimental results show that Complement Naïve Bayes is superior to the other models, especially in handling overfitting, with an F1-score of approximately 0.82.
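As a rough illustration of the Complement variant named in this abstract, the sketch below scores each class by how well the documents of all other classes explain the input and picks the class with the worst fit. It is a simplified form (Laplace smoothing, no weight normalization, no class prior) and not the authors' implementation; the function names and the four-document English training set are hypothetical.

```python
import math
from collections import Counter

def train_complement_nb(docs, labels, alpha=1.0):
    """Fit simplified Complement Naive Bayes weights from tokenized documents.

    For each class c, a term's weight is the log of its smoothed relative
    frequency in all documents that do NOT belong to c.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    weights = {}
    for c in sorted(set(labels)):
        complement = Counter()
        for doc, label in zip(docs, labels):
            if label != c:
                complement.update(doc)
        total = sum(complement.values()) + alpha * len(vocab)
        weights[c] = {t: math.log((complement[t] + alpha) / total) for t in vocab}
    return weights

def predict_complement_nb(weights, doc):
    """Pick the class whose complement explains the document worst."""
    def complement_score(c):
        # Tokens outside the training vocabulary are ignored.
        return sum(weights[c][tok] for tok in doc if tok in weights[c])
    return min(weights, key=complement_score)

# Hypothetical toy training data, for illustration only.
docs = [["very", "good"], ["good", "quality"], ["very", "bad"], ["bad", "quality"]]
labels = ["positive", "positive", "negative", "negative"]

model = train_complement_nb(docs, labels)
prediction = predict_complement_nb(model, ["bad", "quality"])
```

Because each class's weights are estimated from every other class's documents, the estimates draw on more data than a per-class count would, which is the usual argument for this variant on small or imbalanced datasets.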

UMS Institutional Repository

SARS-CoV-2 Transmission in Alberta, British Columbia, and Ontario, Canada, December 25, 2019, to December 1, 2020

Author: Chowell Gerardo
Fung Isaac Chun-Hai
Hung Yuen Wai
Lai Po-Ying
Muniz-Rodriguez Kamalich
Ofori Sylvia
Publication venue: Digital Commons@Georgia Southern
Publication date: 25/03/2021

Objective: This study aimed to investigate coronavirus disease (COVID-19) epidemiology in Alberta, British Columbia, and Ontario, Canada. Methods: Using data through December 1, 2020, we estimated the time-varying reproduction number, Rt, using the EpiEstim package in R, and calculated incidence rate ratios (IRR) across the 3 provinces. Results: In Ontario, 76% (92 745/121 745) of cases were in Toronto, Peel, York, Ottawa, and Durham; in Alberta, 82% (49 878/61 169) in Calgary and Edmonton; in British Columbia, 90% (31 142/34 699) in Fraser and Vancouver Coastal. Across the 3 provinces, Rt dropped to ≤ 1 after April. In Ontario, Rt would remain < 1 in April if congregate-setting-associated cases were excluded. Over summer, Rt remained < 1 in Ontario, ~1 in British Columbia, and ~1 in Alberta, except in early July, when Rt was > 1. In all 3 provinces, Rt was > 1, reflecting surges in case count from September through November. Compared with British Columbia (684.2 cases per 100 000), Alberta (IRR = 2.0; 1399.3 cases per 100 000) and Ontario (IRR = 1.2; 835.8 cases per 100 000) had a higher cumulative case count per 100 000 population. Conclusions: Alberta and Ontario had a higher incidence rate than British Columbia, but Rt trajectories were similar across all 3 provinces.
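The reported ratios and percentages can be checked with simple arithmetic. The sketch below assumes the IRR is the plain ratio of cumulative rates with British Columbia as the reference; the study may have derived it differently, but this ratio reproduces the rounded values in the abstract. Variable and function names are hypothetical.

```python
# Cumulative cases per 100 000 population, as reported in the abstract.
rates = {"British Columbia": 684.2, "Alberta": 1399.3, "Ontario": 835.8}

def incidence_rate_ratio(rate, reference_rate):
    """Ratio of one incidence rate to a reference rate."""
    return rate / reference_rate

reference = rates["British Columbia"]
irr = {
    province: round(incidence_rate_ratio(rate, reference), 1)
    for province, rate in rates.items()
}

# (cases in the named regions, total provincial cases), from the abstract.
regional_counts = {
    "Ontario": (92745, 121745),
    "Alberta": (49878, 61169),
    "British Columbia": (31142, 34699),
}
share_pct = {
    province: round(100 * part / total)
    for province, (part, total) in regional_counts.items()
}
```

Unrounded, the ratios are about 2.05 for Alberta and 1.22 for Ontario, and the regional shares are about 76.2%, 81.5% and 89.7%.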

Georgia Southern University: Digital Commons@Georgia Southern

Enhanced Performance of Dye-Sensitized Solar Cells with Graphene/ZnO Nanoparticles Bilayer Structure

Author: Cheng-Chih Lai
Chih-Hung Hsu
Lung-Chien Chen
Po-Shun Chan
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2014

Crossref

Predicting network traffic anomalies in Denial-of-service attacks – a nonlinear approach

Author: Ding Wei Lau
Po Hung Lai
Soo Fun Tan
Yu Beng Leau
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/10/2021

Network traffic is the amount of data moving across a network at any given time. It consists of data units that are encapsulated in packets and sent over the network. Distributed Denial-of-Service (DDoS) attacks are attempts to disrupt the normal traffic of a network, service, or server. They aim to disrupt legitimate users' work and data transfers by sending large packets or large volumes of traffic. Various network traffic prediction techniques are investigated in this study, and a nonlinear time series method, the Multilayer Perceptron Neural Network (MLPNN), has been chosen to evaluate network traffic prediction. The results with the NSL-KDD dataset show that the approach can reach a prediction accuracy of up to 98.87%. It outperforms other models, such as Sequential Minimal Optimization (SMO), by 2.26%.

UMS Institutional Repository